Among the many satellite missions is the Suomi National Polar-orbiting Partnership (Suomi NPP), which carries five state-of-the-art instruments that help understand the Earth in unprecedented detail. Among these instruments is the Visible Infrared Imaging Radiometer Suite (VIIRS) that is designed to provide moderate-resolution, radiometrically accurate images of the entire Earth twice daily. The VIIRS instrument collects a variety of data corresponding to different bandwidths of light. These in turn can be combined through well-tuned remote sensing algorithms to monitor and analyze a robust set of environmental conditions, including aerosols, clouds, land (fires, temperature), and ocean (ocean color and sea temperature). The immediate scientific contributions have been many and the applications in fields beyond earth science are largely untapped.

The VIIRS Day/Night Bands (DNB) collect levels of nighttime light across the globe at 750-meter resolution, which can be used to proxy popualation among other demographic and economic indicators. The data is available in a number of forms:
For ease of analysis, we’ll focus on the monthly composites. To illustrate the value, the 35 largest US cities by population are mapped and colored in order to reveal the intricate detail and gradations in each local urban environment. Within each city, the relative intensity of light is indicative of the differences in human activity: brighter yellow indicates relatively more population acitivity and darker blue indicates relatively less activity. The development patterns are distinct in each city, contoured to the waterways and roadways. The central business districts are appear in brilliant yellow clusters.

In the map above, light intensity is scaled relative to each respective city's light distribution. Placing all cities on the same light intensity scale allows for between city comparisions that reveal relative differences in human activity across the country. Whereas the yellow areas were relatively contained in the previous map, the new map shows yellow areas that are far more widespread in large cities like New York and Los Angeles, indicating that neighborhoods within those cities are among the brightest in the United States. Occassional specks of black indicate areas where the emitted light is "off the charts" and were not well sampled for the intervals calculation due to their rarity.

Sec 3: Mapping
At this point, we’ll begin to map the top 35 most populous cities, scaling the color palette to reveal local detail within each city. The top 35 cities are listed below:
cities <- c("New York, NY", "Los Angeles, CA","Chicago, IL", "Houston, TX",
"Philadelphia, PA", "Phoenix, AZ", "San Antonio, TX", "San Diego, CA",
"Dallas, TX", "San Jose, CA", "Austin, TX", "Jacksonville, FL",
"San Francisco, CA", "Indianapolis, IN", "Columbus, OH", "Fort Worth, TX",
"Charlotte, NC", "Detroit, MI", "El Paso, TX", "Seattle, WA",
"Denver, CO","Washington, DC", "Memphis, TN", "Boston, MA",
"Nashville, TN", "Baltimore, MD", "Oklahoma City, OK", "Portland, OR",
"Las Vegas, NV", "Louisville, KY","Milwaukee, WI","Albuquerque, NM",
"Tucson, AZ","Fresno, CA","Sacramento, CA")
Next, we’ll set up the graph layout with no margins (mai), with 7 rows and 5 columns (mfrow), with a navy blue background (bg).
##Set graph layout
par(mai=c(0,0,0,0),mfrow = c(7,5),bg='#001a4d', bty='n')
Now let’s get looping to create a montage of city maps. The process is as follows:
- First set a placeholder dataframe for coordinates
- For each city, geocode the city name for the centroid using the geocode() function from ggmaps, append the coordinates to the placeholder. This coordinates file will be used again later.
- Use the extent function to specify the spatial bounding box (the frame around the city) as +/-1 degree longitude and +/-0.25 degree latitude.
- Crop the raster file (rast) by the bounding box.
- Convert the cropped raster into a vector in order to get the radiances, then run a k-means clustering algorithm to find natural intervals within the radiance distribution. For each cluster, extract the maximum radiance.
- Map the city with a navy to yellow color palette with intervals from the k-means clustering.
##Loop through data
coords <- data.frame() ##place holder
for(i in 1:length(cities)){
##Coords
temp_coord <- geocode(cities[i], source = "google")
coords <- rbind(coords,temp_coord)
e <- extent(temp_coord$lon - 1, temp_coord$lon + 1,
temp_coord$lat - 0.25, temp_coord$lat + 0.25)
rc <- crop(rast, e)
##Rescale brackets
sampled <- as.vector(rc)
clusters <- 15
clust <- kmeans(sampled,clusters)$cluster
combined <- as.data.frame(cbind(sampled,clust))
brk <- sort(aggregate(combined[,1], list(combined[,2]), max)[,2])
#Plots
plot(rc, breaks=brk, col=colorRampPalette(c("#001a4d","#0066FF", "yellow"))(clusters),
legend=F,yaxt='n',xaxt='n',frame = F, asp=1.5)
text(temp_coord$lon ,temp_coord$lat + 0.15,
substr(cities[i],1,regexpr(",",cities[i])-1),
col="white", cex=1.25)
rm(combined)
}
Whereas the map above shows city-specific patterns, the second map focuses on comparing patterns across cities using the same color coding. Rather than running a k-means algorithm for each city in order to generate the radiance intervals, one interval is generated based on a random sample of pixels from across the country.
#Set layout
par(mai=c(0,0,0,0),mfrow = c(7,5),bg='#001a4d', bty='n')
#Run clustering
set.seed(123) #set seed for reproducibility
sampled <- sample(rast, 20000) #sample 20,000 pixels
clusters <- 15 ##15 clusters
clust <- kmeans(sampled,clusters)$cluster
combined <- as.data.frame(cbind(sampled,clust))
brk <- sort(aggregate(combined[,1], list(combined[,2]), max)[,2])
##Loop through each city
for(i in 1:length(cities)){
temp_coord <- coords[i,] ##re-use the coordinates
e <- extent(temp_coord$lon - 1, temp_coord$lon + 1,
temp_coord$lat - 0.25, temp_coord$lat + 0.25)
rc <- crop(rast, e)
#Plots
plot(rc, breaks=brk, col=colorRampPalette(c("#001a4d","#0066FF", "yellow"))(clusters),
legend=F,yaxt='n',xaxt='n',frame = F, asp=1.5)
text(temp_coord$lon ,temp_coord$lat + 0.15,
substr(cities[i],1,regexpr(",",cities[i])-1),
col="white", cex=1.25)
rm(combined)
}
Sec 5: Visualizations
Histogram by select MSA
For a quick example, we’ve selected five MSAs that are serially processed into histograms of logarithmically transformed radiance. The data is processed first for each of the five MSAs, then turned into a series of histograms using a combination of ggplot and plotly.
##MSAs by GEOID
msa_list <- c(16180,19140,45820,42540,35620)
##Placeholder
radiances <- data.frame()
##Loop MSA file
for(i in msa_list){
print(i)
#Extract MSA i polygon
shp_temp <- msa[msa@data$GEOID==i,]
#Extract MSA abbreviated name
if(regexpr("-",as.character(shp_temp@data$NAME)[1])[1]==-1){
loc = as.character(substr(as.character(shp_temp@data$NAME)[1],1,regexpr(",",as.character(shp_temp@data$NAME)[1])-1))
} else{
loc = as.character(substr(as.character(shp_temp@data$NAME)[1],1,regexpr("-",as.character(shp_temp@data$NAME)[1])-1))
}
#Extract the radiances, append to radiances placeholder
rad <- masq(shp_temp,rast,1)$avg_rad
temp <- data.frame(loc = as.character(paste(loc,"(TNL = ",round(sum(rad),0),")",sep="")), avg_rad = rad)
radiances <- rbind(radiances,temp)
}
#Use ggplot to create histograms by MSA group. Preload.
ggplot(radiances, aes(x=log(avg_rad))) +
geom_histogram(position="identity", alpha=0.4) +
facet_grid(. ~ loc)
#Remove all axes labels for style
x <- list(
zeroline = FALSE,
showline = FALSE,
showticklabels = FALSE,
showgrid = FALSE
)
y <- list(
zeroline = FALSE,
showline = FALSE,
showticklabels = FALSE,
showgrid = FALSE
)
#Initiate a plotly graph without axes
ggplotly() %>% layout(xaxis=x, yaxis=y)
Scatterplot of TNL vs. Population
To extract radiance data from five MSAs take a few seconds, but a few hundred MSAs takes a while longer. We rely on the doParallel and foreach libraries to parallelize data extraction. Below, two processor cores are used to extract the TNL for each MSA and record into a consolidated data frame of MSAs and TNL. This data frame is then joined with the MSA population files to allow direct comparison between population and TNL. The ultimate interactive graph illustrates a clear positive correlation between the two quantities.
##Set up comparisons
registerDoParallel(cores=2)
extract <- foreach(i=1:nrow(msa@data), .combine=rbind) %dopar% {
data <- masq(msa,rast,i)
data.frame(GEOID = data$GEOID[1],sum = sum(data$avg_rad))
}
extract$GEOID <- as.numeric(as.character(extract$GEOID))
##Join in data
joined<-merge(extract, msa_pop[,c("CBSA","NAME","POPESTIMATE2014")],by.x="GEOID",by.y="CBSA")
colnames(joined) <- c("GEOID","TNL","MSA","Population")
plot_ly(joined,
x = log(TNL),
y = log(Population),
text = paste("MSA: ", MSA),
mode = "markers",
color = TNL,colors="PuOr") %>%
layout(title="Total Nighttime Light vs. Population", showlegend = F)
Basic joining is just the beginning. In an upcoming tutorial, we will explore mining for patterns using radiance data and a large feature set among other topics.